34 research outputs found

    On the Benefit of Merging Suffix Array Intervals for Parallel Pattern Matching

    We present parallel algorithms for exact and approximate pattern matching with suffix arrays, using a CREW-PRAM with $p$ processors. Given a static text of length $n$, we first show how to compute the suffix array interval of a given pattern of length $m$ in $O(\frac{m}{p} + \lg p + \lg\lg p \cdot \lg\lg n)$ time for $p \le m$. For approximate pattern matching with $k$ differences or mismatches, we show how to compute all occurrences of a given pattern in $O(\frac{m^k\sigma^k}{p}\max(k, \lg\lg n) + (1+\frac{m}{p}) \lg p \cdot \lg\lg n + \text{occ})$ time, where $\sigma$ is the size of the alphabet and $p \le \sigma^k m^k$. The workhorse of our algorithms is a data structure for merging suffix array intervals quickly: Given the suffix array intervals for two patterns $P$ and $P'$, we present a data structure for computing the interval of $PP'$ in $O(\lg\lg n)$ sequential time, or in $O(1+\lg_p\lg n)$ parallel time. All our data structures are of size $O(n)$ bits (in addition to the suffix array).
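    To make the term "suffix array interval" concrete, here is a minimal sequential sketch in Python. It is not the parallel algorithm or the interval-merging data structure from the abstract: it uses a naive suffix array construction and plain binary search, and the function names `build_suffix_array` and `sa_interval` are hypothetical, chosen only for illustration.

```python
def build_suffix_array(text):
    # Naive construction by sorting all suffixes; for illustration only,
    # far slower than the linear-time constructions used in practice.
    return sorted(range(len(text)), key=lambda i: text[i:])


def sa_interval(text, sa, pattern):
    """Return the half-open range [lo, hi) of positions in `sa` whose
    suffixes start with `pattern` (empty range if there is no match)."""
    m = len(pattern)

    # Lower bound: first suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return start, lo


text = "banana"
sa = build_suffix_array(text)
lo, hi = sa_interval(text, sa, "ana")
occurrences = sorted(sa[lo:hi])  # text positions where "ana" starts
```

    In this sketch, the interval of a concatenation such as "an" + "a" always lies inside the interval of its first part "an"; the abstract's contribution is computing that narrower interval quickly from the two given intervals, which this example does not attempt.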

    Dynamic Skyline Computation with the Skyline Breaker Algorithm

    Given a sequential data input, we tackle parallel dynamic skyline computation of the read data by means of a spatial tree structure for indexing fine-grained feature vectors. For this purpose, we modified the Skyline Breaker algorithm, which solves skyline computation with multiple local split decision trees concurrently. With this approach, we propose an algorithm for dynamic skyline computation that inherits the robustness against the curse of dimensionality and different data distributions.
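    As a reminder of what a dynamic skyline maintains, here is a minimal sequential sketch in Python. It is not the Skyline Breaker algorithm and uses no tree index or parallelism; it only shows the dominance test and an incremental update, assuming that smaller values are better in every dimension. The names `dominates` and `insert_point` are hypothetical.

```python
def dominates(a, b):
    # a dominates b if a is no worse in every dimension and strictly
    # better in at least one (assumption: smaller is better).
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def insert_point(skyline, point):
    """Return the skyline after reading one more point from the input."""
    if any(dominates(s, point) for s in skyline):
        return skyline  # the new point is dominated and changes nothing
    # Otherwise drop every point the newcomer dominates, then add it.
    return [s for s in skyline if not dominates(point, s)] + [point]


skyline = []
for p in [(3, 3), (1, 4), (4, 1), (2, 2), (5, 5)]:
    skyline = insert_point(skyline, p)
```

    Each insertion here scans the whole current skyline, which is the cost that index structures such as the spatial tree mentioned in the abstract aim to reduce.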

    On the Benefit of Merging Suffix Array Intervals for Parallel Pattern Matching

    We present parallel algorithms for exact and approximate pattern matching with suffix arrays, using a CREW-PRAM with p processors. Given a static text of length n, we first show how to compute the suffix array interval of a given pattern of length m in O(m/p + lg p + lg lg p * lg lg n) time for p <= m. For approximate pattern matching with k differences or mismatches, we show how to compute all occurrences of a given pattern in O((m^k sigma^k)/p max(k, lg lg n) + (1 + m/p) lg p * lg lg n + occ) time, where sigma is the size of the alphabet and p <= sigma^k m^k. The workhorse of our algorithms is a data structure for merging suffix array intervals quickly: Given the suffix array intervals for two patterns P and P', we present a data structure for computing the interval of PP' in O(lg lg n) sequential time, or in O(1 + lg_p lg n) parallel time. All our data structures are of size O(n) bits (in addition to the suffix array).

    c-trie++: A Dynamic Trie Tailored for Fast Prefix Searches

    Given a dynamic set $K$ of $k$ strings of total length $n$ whose characters are drawn from an alphabet of size $\sigma$, a keyword dictionary is a data structure built on $K$ that provides locate, prefix search, and update operations on $K$. Under the assumption that $\alpha = w / \lg \sigma$ characters fit into a single machine word of $w$ bits, we propose a keyword dictionary that represents $K$ in $n \lg \sigma + \Theta(k \lg n)$ bits of space, supporting all operations in $O(m / \alpha + \lg \alpha)$ expected time on an input string of length $m$ in the word RAM model. This data structure is backed by an exhaustive practical evaluation, highlighting the practical usefulness of the proposed data structure, especially for prefix searches - one of the most elementary keyword dictionary operations.
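    The three keyword dictionary operations named in the abstract can be illustrated with a plain pointer-based trie in Python. This is a minimal sketch, not c-trie++: it does no word packing and makes no attempt at the space or time bounds stated above, and the class and method names are hypothetical.

```python
class Trie:
    """Plain dynamic trie: one node per character, children in a dict."""

    def __init__(self):
        self.children = {}
        self.is_key = False

    def insert(self, key):
        # Update operation: walk down, creating missing nodes.
        node = self
        for ch in key:
            if ch not in node.children:
                node.children[ch] = Trie()
            node = node.children[ch]
        node.is_key = True

    def _walk(self, s):
        # Follow s from the root; return None if the path does not exist.
        node = self
        for ch in s:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def locate(self, key):
        # True only if key itself was inserted, not merely a prefix of a key.
        node = self._walk(key)
        return node is not None and node.is_key

    def prefix_search(self, prefix):
        # Collect every stored key that starts with prefix, in sorted order.
        node = self._walk(prefix)
        if node is None:
            return []
        found, stack = [], [(node, prefix)]
        while stack:
            cur, s = stack.pop()
            if cur.is_key:
                found.append(s)
            for ch, child in cur.children.items():
                stack.append((child, s + ch))
        return sorted(found)


trie = Trie()
for word in ["car", "cart", "cat", "dog"]:
    trie.insert(word)
matches = trie.prefix_search("ca")
```

    This version inspects one character per step, so a query of length $m$ costs time proportional to $m$; the abstract's $O(m / \alpha + \lg \alpha)$ bound comes from handling $\alpha$ characters per machine word, which is outside the scope of this sketch.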